AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Neural Information Processing Systems, Apr-25-2026, 13:34:36 GMT

3dc4876f3f08201c7c76cb71fa1da439-Supplemental.pdf

artificial intelligence, cov, machine learning, (17 more...)

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing Systems, Feb-12-2026, 11:46:38 GMT

e467582d42d9c13fa9603df16f31de6d-Supplemental-Datasets_and_Benchmarks.pdf

auxiliary task, prediction, protein sequence, (14 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
Asia > China > Shanghai > Shanghai (0.05)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-8-2026, 10:36:20 GMT

Appendix of Functionally Regionalized Knowledge Transfer for Low-resource Drug Discovery

For FC-individual, we train each testing assay separately with a two-layer fully connected base learner. For FC-All, a two-layer fully connected model is trained on samples from both the support set and query set of source assays and from the support set of the target assay. C.1 Drug Activity Prediction Data: For drug activity prediction, we summarize the number of assays belonging to each target family: GPCR (685), Ion channel (215), Kinase (665), NHR (123), Binding (2523), Phenotypic (2299), Functional (1689), Proteinase (289), ADME (55).

artificial intelligence, machine learning, meta-validation, (5 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Neural Information Processing Systems, Nov-13-2025, 23:23:02 GMT

3dc4876f3f08201c7c76cb71fa1da439-Supplemental.pdf

artificial intelligence, cov, machine learning, (17 more...)

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Hong Kong (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial Intelligence, Oct-29-2025

Low-N Protein Activity Optimization with FolDE

Roberts, Jacob B., Ji, Catherine R., Donnell, Isaac, Young, Thomas D., Pearson, Allison N., Hudson, Graham A., Keiser, Leah S., Wesselkamper, Mia, Winegar, Peter H., Ludwig, Janik, Klass, Sarah H., Sheth, Isha V., Ukabiala, Ezechinyere C., Astolfi, Maria C. T., Eysenbach, Benjamin, Keasling, Jay D.

Proteins are traditionally optimized through the costly construction and measurement of many mutants. Active Learning-assisted Directed Evolution (ALDE) alleviates that cost by predicting the best improvements and iteratively testing mutants to inform predictions. However, existing ALDE methods face a critical limitation: selecting the highest-predicted mutants in each round yields homogeneous training data insufficient for accurate prediction models in subsequent rounds. Here we present FolDE, an ALDE method designed to maximize end-of-campaign success. In simulations across 20 protein targets, FolDE discovers 23% more top 10% mutants than the best baseline ALDE method (p=0.005) and is 55% more likely to find top 1% mutants. FolDE achieves this primarily through naturalness-based warm-starting, which augments limited activity measurements with protein language model outputs to improve activity prediction. We also introduce a constant-liar batch selector, which improves batch diversity; this is important in multi-mutation campaigns but had limited effect in our benchmarks. The complete workflow is freely available as open-source software, making efficient protein optimization accessible to any laboratory.
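The constant-liar idea mentioned in the abstract can be illustrated with a minimal sketch. This is not FolDE's implementation: the surrogate below is a toy kernel-weighted mean, and the names `rbf`, `predict_mean`, and `constant_liar_batch`, the one-dimensional features, and the choice of the lowest observed activity as the "lie" are all assumptions made for illustration. The point is only the loop structure: after each pick, the picked mutant is added to the training data with a placeholder value so that the next pick is pushed away from it.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Gaussian similarity between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length_scale ** 2))


def predict_mean(x, train_x, train_y, prior_mean):
    """Toy surrogate (assumption): kernel-weighted mean shrunk toward prior_mean."""
    weights = [rbf(x, t) for t in train_x]
    weighted_sum = sum(w * y for w, y in zip(weights, train_y))
    return (weighted_sum + prior_mean) / (sum(weights) + 1.0)


def constant_liar_batch(candidates, train_x, train_y, batch_size, lie=None):
    """Pick a batch one item at a time, pretending each pick measured `lie`."""
    xs, ys = list(train_x), list(train_y)
    prior_mean = sum(ys) / len(ys)  # requires at least one real measurement
    if lie is None:
        lie = min(ys)  # assumption: pessimistic lie (lowest observed value)
    remaining = list(range(len(candidates)))
    batch = []
    while remaining and len(batch) < batch_size:
        # Rank the remaining candidates with the current (partly fictitious) data.
        best = max(
            remaining,
            key=lambda i: predict_mean(candidates[i], xs, ys, prior_mean),
        )
        batch.append(best)
        remaining.remove(best)
        # Add the pick with the fictitious value so nearby candidates look worse.
        xs.append(candidates[best])
        ys.append(lie)
    return batch


# Hypothetical data: two measured mutants and three untested candidates.
train_x = [(0.0,), (1.0,)]
train_y = [0.0, 1.0]
candidates = [(1.0,), (1.05,), (3.0,)]
batch = constant_liar_batch(candidates, train_x, train_y, batch_size=2)
print(batch)
```

In this toy run, ranking by predicted mean alone would choose the two nearly identical candidates, whereas the fictitious measurement steers the second pick to the distant candidate. A real campaign would substitute a trained activity model and a proper acquisition function.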

artificial intelligence, machine learning, optimization problem, (15 more...)

2510.24053

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

arXiv.org Artificial Intelligence, Sep-15-2025

Leveraging Data Augmentation and Siamese Learning for Predictive Process Monitoring

van Straten, Sjoerd, Padella, Alessandro, Hassani, Marwan

Predictive Process Monitoring (PPM) enables forecasting future events or outcomes of ongoing business process instances based on event logs. However, deep learning PPM approaches are often limited by the low variability and small size of real-world event logs. To address this, we introduce SiamSA-PPM, a novel self-supervised learning framework that combines Siamese learning with Statistical Augmentation for Predictive Process Monitoring. It employs three novel statistically grounded transformation methods that leverage control-flow semantics and frequent behavioral patterns to generate realistic, semantically valid new trace variants. These augmented views are used within a Siamese learning setup to learn generalizable representations of process prefixes without the need for labeled supervision. Extensive experiments on real-life event logs demonstrate that SiamSA-PPM achieves competitive or superior performance compared to the SOTA in both next activity and final outcome prediction tasks. Our results further show that statistical augmentation significantly outperforms random transformations and improves variability in the data, highlighting SiamSA-PPM as a promising direction for training data enrichment in process prediction.

artificial intelligence, deep learning, machine learning, (14 more...)

2507.18293

Country: Europe (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine (0.47)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Aug-19-2025, 13:57:54 GMT

PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding (Supplementary Material)

For example, the feature of dipeptide "st" is defined by its dipeptide composition ( The Moran feature descriptor defines the distribution of amino acid properties along a protein sequence. It should be noted that there are evident class imbalances in two multi-class classification tasks. Table 1: Balanced metric (weighted F1) compared with accuracy on multi-class classification tasks. We report mean (std) for each experiment. Used as a feature extractor with pre-trained weights frozen.
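The excerpt above cuts off before giving the dipeptide composition formula, so the following is only a sketch under a commonly used definition: the count of each ordered amino acid pair divided by the number of adjacent pairs in the sequence. The function name `dipeptide_composition`, the 20-letter alphabet constant, and the handling of non-standard residues (pairs containing them are skipped) are assumptions, not details taken from the PEER supplementary material.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return a 400-entry dict mapping each dipeptide to its relative frequency."""
    seq = sequence.upper()
    # All overlapping adjacent pairs, e.g. "STA" -> ["ST", "TA"].
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Assumption: pairs containing a non-standard residue are ignored.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


features = dipeptide_composition("STSTA")
print(features["ST"], features["TS"], features["TA"])
```

For the hypothetical sequence "STSTA" there are four adjacent pairs (ST, TS, ST, TA), so "ST" accounts for half of them and the other two for a quarter each.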

artificial intelligence, machine learning, prediction, (14 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
Asia > China > Shanghai > Shanghai (0.05)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Jul-21-2025

DailyLLM: Context-Aware Activity Log Generation Using Multi-Modal Sensors and LLMs

Tian, Ye, Ren, Xiaoyuan, Wang, Zihao, Gungor, Onat, Yu, Xiaofan, Rosing, Tajana

Rich and context-aware activity logs facilitate user behavior analysis and health monitoring, making them a key research focus in ubiquitous computing. The remarkable semantic understanding and generation capabilities of Large Language Models (LLMs) have recently created new opportunities for activity log generation. However, existing methods continue to exhibit notable limitations in terms of accuracy, efficiency, and semantic richness. To address these challenges, we propose DailyLLM. To the best of our knowledge, this is the first log generation and summarization system that comprehensively integrates contextual activity information across four dimensions: location, motion, environment, and physiology, using only sensors commonly available on smartphones and smartwatches. To achieve this, DailyLLM introduces a lightweight LLM-based framework that integrates structured prompting with efficient feature extraction to enable high-level activity understanding. Extensive experiments demonstrate that DailyLLM outperforms state-of-the-art (SOTA) log generation methods and can be efficiently deployed on personal computers and Raspberry Pi. Utilizing only a 1.5B-parameter LLM model, DailyLLM achieves a 17% improvement in log generation BERTScore precision compared to the 70B-parameter SOTA baseline, while delivering nearly 10x faster inference speed.

large language model, machine learning, natural language, (17 more...)

2507.13737

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.94)
Health & Medicine > Consumer Health (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jul-4-2025

RLHGNN: Reinforcement Learning-driven Heterogeneous Graph Neural Network for Next Activity Prediction in Business Processes

Wang, Jiaxing, Yu, Yifeng, Song, Jiahan, Cao, Bin, Fan, Jing, Zhang, Ji

Next activity prediction represents a fundamental challenge for optimizing business processes in service-oriented architectures such as microservices environments, distributed enterprise systems, and cloud-native platforms, which enables proactive resource allocation and dynamic service composition. Despite the prevalence of sequence-based methods, these approaches fail to capture non-sequential relationships that arise from parallel executions and conditional dependencies. Even though graph-based approaches address structural preservation, they suffer from homogeneous representations and static structures that apply uniform modeling strategies regardless of individual process complexity characteristics. To address these limitations, we introduce RLHGNN, a novel framework that transforms event logs into heterogeneous process graphs with three distinct edge types grounded in established process mining theory. Our approach creates four flexible graph structures by selectively combining these edges to accommodate different process complexities, and employs reinforcement learning formulated as a Markov Decision Process to automatically determine the optimal graph structure for each specific process instance. RLHGNN then applies heterogeneous graph convolution with relation-specific aggregation strategies to effectively predict the next activity. This adaptive methodology enables precise modeling of both sequential and non-sequential relationships in service interactions. Comprehensive evaluation on six real-world datasets demonstrates that RLHGNN consistently outperforms state-of-the-art approaches. Furthermore, it maintains an inference latency of approximately 1 ms per prediction, representing a highly practical solution suitable for real-time business process monitoring applications.
Service-oriented architectures have fundamentally transformed modern business process implementation, which enables distributed services to coordinate through well-defined interfaces for delivering substantial business value [1], [2]. Jiaxing Wang, Yifeng Yu, Jiahan Song, Bin Cao, and Jing Fan are with the College of Computer Science and Technology, Zhejiang University of Technology, 310023, Hangzhou, China, and also with Zhejiang Key Laboratory of Visual Information Intelligent Processing, 310023, Hangzhou, China (email: wjx@zjut.edu.cn,

graph structure, machine learning, reinforcement learning, (20 more...)

2507.0269

Country:

Asia > China > Zhejiang Province > Hangzhou (0.44)
Europe > Austria > Vienna (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.67)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Architecture (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)